Research Article | Open Access
Volume 2025 | Article ID 100039 | https://doi.org/10.1016/j.plaphe.2025.100039

Winter wheat yield prediction using UAV-based multivariate time series data and variate-independent tokenization

Yan Ge,1,2 Zhichang Zhu,1 Shichao Jin,3 Jingrong Zang,4 Ruinan Zhang,4 Qing Li,4 Zhuangzhuang Sun,4 Shouyang Liu,3 Huanliang Xu,1 and Zhaoyu Zhai 1

1College of Artificial Intelligence, Nanjing Agricultural University, Nanjing, 210095, China
2College of Engineering, Nanjing Agricultural University, Nanjing, 210031, China
3Plant Phenomics Research Centre, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, 210095, China
4College of Agriculture, Nanjing Agricultural University, Nanjing, 210095, China

Received 20 Nov 2024
Accepted 17 Mar 2025
Published 30 Mar 2025

Abstract

Breeding high-yield wheat varieties is essential for food security. Accurate and rapid plot-level yield prediction using unmanned aerial vehicles (UAVs) would enable breeders to identify meaningful genotypic variation and select superior lines, thus accelerating the selection of climate-adapted, high-yield varieties. Although current prediction models already use multivariate time series data, they usually embed all the raw data with a simple concatenation operation, which results in low prediction accuracy. To address this limitation, we propose an improved transformer-based wheat yield prediction model with a variate-independent tokenization approach. This approach embeds 14 vegetation indices and 28 morphological traits along the feature dimension, enabling the model to learn variate-centric representations. We also apply a multivariate attention mechanism to evaluate the contribution of each variate and capture multivariate correlations. Extensive experiments are conducted to verify the effectiveness of our model, including comparisons across 3 nitrogen treatments, 2 years, and 56 wheat varieties, as well as comparisons with state-of-the-art approaches. The experimental results indicate that our model achieves the best prediction performance, with an R² of 0.862, surpassing classical recurrent neural network and transformer variants. We also confirm that combining vegetation indices and morphological traits outperforms single-source data for the prediction task, with a prediction performance gain of approximately 4%. In conclusion, this study provides a novel approach that uses an improved transformer model and multivariate time series data to quantitatively predict plot-level wheat yield, thus enabling the rapid selection of high-yield varieties for breeding.
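
The contrast between concatenation-based embedding and variate-independent tokenization can be illustrated with a minimal sketch. It is not the authors' implementation: the number of time steps (`T_STEPS`), the embedding width (`D_MODEL`), the random toy data, and all function names are hypothetical, and the attention step omits the learned query, key, and value projections that a real transformer would use. Only the count of 42 variates (14 vegetation indices plus 28 morphological traits) comes from the abstract.

```python
import math
import random

# 42 variates as in the abstract; T_STEPS and D_MODEL are illustrative assumptions.
N_VARIATES = 42
T_STEPS = 6
D_MODEL = 8

rng = random.Random(0)

# Toy plot-level series: T_STEPS rows (one per UAV flight) x N_VARIATES columns.
series = [[rng.random() for _ in range(N_VARIATES)] for _ in range(T_STEPS)]


def tokenize_by_time_step(x):
    """Concatenation style: one token per time step, holding all variates."""
    return [list(step) for step in x]


def tokenize_by_variate(x):
    """Variate-independent style: one token per variate, holding its whole series."""
    return [list(column) for column in zip(*x)]


def project(tokens, weight):
    """Linearly map each token (length d_in) to d_out; weight is d_in x d_out."""
    columns = list(zip(*weight))
    return [[sum(v * w for v, w in zip(tok, col)) for col in columns] for tok in tokens]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def variate_attention(emb):
    """Scaled dot-product attention weights between variate tokens (N x N).

    Simplification: embeddings serve directly as queries and keys.
    """
    scale = math.sqrt(len(emb[0]))
    scores = [[sum(a * b for a, b in zip(q, k)) / scale for k in emb] for q in emb]
    return [softmax(r) for r in scores]


time_tokens = tokenize_by_time_step(series)    # T_STEPS tokens of length N_VARIATES
variate_tokens = tokenize_by_variate(series)   # N_VARIATES tokens of length T_STEPS

# Random stand-in for a learned embedding matrix (T_STEPS x D_MODEL).
weight = [[rng.gauss(0.0, 0.1) for _ in range(D_MODEL)] for _ in range(T_STEPS)]
embeddings = project(variate_tokens, weight)   # N_VARIATES x D_MODEL
attention = variate_attention(embeddings)      # N_VARIATES x N_VARIATES

print(len(time_tokens), len(time_tokens[0]))
print(len(variate_tokens), len(variate_tokens[0]))
print(len(attention), len(attention[0]))
```

In this sketch each row of `attention` sums to 1 and indicates how strongly one variate attends to every other variate, which is one plausible way to read the per-variate contributions and multivariate correlations mentioned in the abstract.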

© 2019-2023 Plant Phenomics. All rights reserved. ISSN 2643-6515.
